RAW SEQUENCE LISTING 
ERROR REPORT 



BIOTECHNOLOGY SYSTEMS BRANCH




HGH Center 



The Biotechnology Systems Branch of the Scientific and Technical Information 
Center (STIC) detected errors when processing the following computer readable 
form: 

Application 
Source: 



Serial Number: 09/595,326A



Date Processed by STIC: 5/31/2001



THE ATTACHED PRINTOUT EXPLAINS DETECTED ERRORS. 

PLEASE FORWARD THIS INFORMATION TO THE APPLICANT BY EITHER: 

1) INCLUDING A COPY OF THIS PRINTOUT IN YOUR NEXT COMMUNICATION TO THE 
APPLICANT, WITH A NOTICE TO COMPLY or, 

2) TELEPHONING APPLICANT AND FAXING A COPY OF THIS PRINTOUT, WITH A 
NOTICE TO COMPLY 

FOR CRF SUBMISSION QUESTIONS, PLEASE CONTACT MARK SPENCER, 703-308-4212. 



FOR SEQUENCE RULES INTERPRETATION, PLEASE CONTACT ROBERT WAX, 703-308-4216. 
PATENTIN 2.1 e-mail help: patin21help@uspto.gov or phone 703-306-4119 (R. Wax) 
PATENTIN 3.0 e-mail help: patin3help@uspto.gov or phone 703-306-4119 (R. Wax)

TO REDUCE ERRORED SEQUENCE LISTINGS, PLEASE USE THE CHECKER 
VERSION 3.0 PROGRAM ACCESSIBLE THROUGH THE U.S. PATENT AND 
TRADEMARK OFFICE WEBSITE. SEE BELOW: 



Checker Version 3.0 

The Checker Version 3.0 application is a state-of-the-art Windows-based software program
employing a logical and intuitive user-interface to check whether a sequence listing is in 
compliance with format and content rules. Checker Version 3.0 works for sequence listings 
generated for the original version of 37 CFR §§1.821 - 1.825 effective October 1, 1990 (old 
rules) and the revised version (new rules) effective July 1, 1998 as well as World Intellectual 
Property Organization (WIPO) Standard ST.25. 

Checker Version 3.0 replaces the previous DOS-based version of Checker, and is Y2K-compliant.
Checker allows public users to check sequence listings in Computer Readable Form
(CRF) before submitting them to the United States Patent and Trademark Office (USPTO). 
Use of Checker prior to filing the sequence listing is expected to result in fewer errored sequence 
listings, thus saving time and money. 



Checker Version 3.0 can be downloaded from the USPTO website at the following address:

http://www.uspto.gov/web/offices/pac/checker 



Raw Sequence Listing Error Summary



ERROR DETECTED SUGGESTED CORRECTION 



SERIAL NUMBER: 



ATTN: NEW RULES CASES: PLEASE DISREGARD ENGLISH "ALPHA" HEADERS, WHICH WERE INSERTED BY PTO SOFTWARE

1 Wrapped Nucleics
The number/text at the end of each line "wrapped" down to the next line.
This may occur if your file was retrieved in a word processor after creating it.
Please adjust your right margin to .3, as this will prevent "wrapping".

2 Wrapped Aminos
The amino acid number/text at the end of each line "wrapped" down to the next line.
This may occur if your file was retrieved in a word processor after creating it.
Please adjust your right margin to .3, as this will prevent "wrapping".

3 Incorrect Line Length
The rules require that a line not exceed 72 characters in length. This includes spaces.

4 Misaligned Amino Acid Numbering
The numbering under each 5th amino acid is misaligned. This may be caused by the use of tabs
between the numbering. It is recommended to delete any tabs and use spacing between the numbers.

5 Non-ASCII
This file was not saved in ASCII (DOS) text, as required by the Sequence Rules.
Please ensure your subsequent submission is saved in ASCII text so that it can be processed.

6 Variable Length
Sequence(s) contain n's or Xaa's which represented more than one residue.
As per the rules, each n or Xaa can only represent a single residue.
Please present the maximum number of each residue having variable length and
indicate in the (ix) feature section that some may be missing.

7 PatentIn ver. 2.0 "bug"
A "bug" in PatentIn version 2.0 has caused the <220>-<223> section to be missing from amino acid
sequence(s). Normally, PatentIn would automatically generate this section from the
previously coded nucleic acid sequence. Please manually copy the relevant <220>-<223> section
to the subsequent amino acid sequence. This applies primarily to the mandatory <220>-<223>
sections for Artificial or Unknown sequences.

8 Skipped Sequences (OLD RULES)
Sequence(s) missing. If intentional, please use the following format for each skipped sequence:

    (2) INFORMATION FOR SEQ ID NO:X:
    (i) SEQUENCE CHARACTERISTICS: (Do not insert any headings under "SEQUENCE CHARACTERISTICS")
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:X:
    This sequence is intentionally skipped

Please also adjust the "(iii) NUMBER OF SEQUENCES:" response to include the skipped sequence(s).

9 Skipped Sequences (NEW RULES)
Sequence(s) missing. If intentional, please use the following format for each skipped sequence:

    <210> sequence id number
    <400> sequence id number
    000

10 Use of n's or Xaa's (NEW RULES)
Use of n's and/or Xaa's has been detected in the Sequence Listing.
Use of <220> to <223> is MANDATORY if n's or Xaa's are present.
In <220> to <223> section, please explain location of n or Xaa, and which residue n or Xaa represents.

11 Use of "Artificial" (NEW RULES)
Use of "Artificial" only as "<213> Organism" response is incomplete, per 1.823(b) of New Sequence Rules.
Valid response is Artificial Sequence.

12 Use of <220> Feature (NEW RULES)
Sequence(s) are missing the <220> Feature and associated headings.
Use of <220> to <223> is MANDATORY if <213> ORGANISM is "Artificial Sequence" or "Unknown".
Please explain source of genetic material in <220> to <223> section.
(See "Federal Register," 6/01/98, Vol. 63, No. 104, pp. 29631-32) (Sec. 1.823 of new Rules)

13 PatentIn ver. 2.0 "bug"
Please do not use "Copy to Disk" function of PatentIn version 2.0. This causes a corrupted
file, resulting in missing mandatory numeric identifiers and responses (as indicated on raw sequence listing).
Instead, please use "File Manager" or any other means to copy file to floppy disk.
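
The purely mechanical items in the summary (the 72-character line limit, the ASCII requirement, and the advice to avoid tabs) can be pre-checked before a file is submitted. The following is a minimal sketch only, not the Checker program itself; the function name `check_format` and the wording of its messages are hypothetical.

```python
def check_format(raw: bytes, max_len: int = 72):
    """Return (line_number, problem) tuples for a sequence listing file.

    Hypothetical helper: it covers only the line-length, ASCII, and tab
    items of the error summary, nothing else.
    """
    problems = []
    for number, line in enumerate(raw.splitlines(), start=1):
        # The rules count spaces toward the 72-character limit.
        if len(line) > max_len:
            problems.append((number, f"line longer than {max_len} characters"))
        # Any byte above 127 means the file was not saved as plain ASCII text.
        if any(byte > 127 for byte in line):
            problems.append((number, "non-ASCII character"))
        # Tabs between position numbers can misalign amino acid numbering.
        if b"\t" in line:
            problems.append((number, "tab character"))
    return problems
```

Reading the file in binary mode (`open(path, "rb").read()`) keeps the check independent of any text encoding the word processor may have applied.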



AMC - Biotechnology Systems Branch - 4/06/2001
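
The new-rules requirement that a <220> to <223> section accompany any "Artificial Sequence" or "Unknown" organism, and any use of n or Xaa, can be expressed as a small check. This is a sketch under stated assumptions: it expects the applicant's own file layout (numeric identifiers without the English headers added by PTO software), it treats a record as compliant when both <220> and <223> are present, and the names `split_records` and `missing_feature_section` are hypothetical.

```python
import re

TAG = re.compile(r"^\s*<(\d{3})>\s*(.*)$")

def split_records(text):
    """Group lines into per-sequence records, starting one at each <210>."""
    records, current = [], None
    for line in text.splitlines():
        match = TAG.match(line)
        if match and match.group(1) == "210":
            current = {"id": match.group(2).strip(), "tags": {}, "body": []}
            records.append(current)
        elif current is None:
            continue  # general information before the first sequence
        elif match:
            current["tags"][match.group(1)] = match.group(2).strip()
        elif "400" in current["tags"]:
            current["body"].append(line)  # residue and numbering lines
    return records

def missing_feature_section(record):
    """Return reasons why this record needs a <220> to <223> section it lacks."""
    tags = record["tags"]
    if "220" in tags and "223" in tags:
        return []
    reasons = []
    if tags.get("213") in ("Artificial Sequence", "Unknown"):
        reasons.append("organism is Artificial Sequence or Unknown")
    body = " ".join(record["body"])
    if tags.get("212") == "PRT" and "Xaa" in body.split():
        reasons.append("Xaa present")
    if tags.get("212") in ("DNA", "RNA") and "n" in re.sub(r"[\d\s]", "", body):
        reasons.append("n present")
    return reasons
```

A record that returns an empty list may still be deficient in other ways; the sketch only tests for the presence of the section, not whether its explanation is adequate.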
RAW SEQUENCE LISTING DATE: 05/31/2001
PATENT APPLICATION: US/09/595,326A TIME: 12:15:54

Input Set : C:\PTO.txt
Output Set: C:\CRF3\05312001\I595326A.raw

Does Not Comply
Corrected Diskette Needed

3 <110> APPLICANT: ALEXANDROV, Nickolai
4 TROUKHAN, Maxim
6 <120> TITLE OF INVENTION: Sequence-Determined DNA Fragments and Corresponding Polypeptides Encoded
7 Thereby
9 <130> FILE REFERENCE: 2750-0942P
11 <140> CURRENT APPLICATION NUMBER: 09/595,326A
12 <141> CURRENT FILING DATE: 2000-06-16
14 <160> NUMBER OF SEQ ID NOS: 769
16 <170> SOFTWARE: PatentIn version 3.0

18 <210> SEQ ID NO: 1
19 <211> LENGTH: 5
20 <212> TYPE: PRT
21 <213> ORGANISM: Consensus Sequence
23 <400> SEQUENCE: 1
25 Leu Ile Val Met Thr
26 1               5

28 <210> SEQ ID NO: 2
29 <211> LENGTH: 5
30 <212> TYPE: PRT
31 <213> ORGANISM: Consensus Sequence
33 <400> SEQUENCE: 2
35 Leu Ile Val Met Phe
36 1               5

38 <210> SEQ ID NO: 3
39 <211> LENGTH: 5
40 <212> TYPE: PRT
41 <213> ORGANISM:
43 <400> SEQUENCE: 3
45 Gly Ala Thr Met
46 1

48 <210> SEQ ID NO: 4
49 <211> LENGTH: 4
50 <212> TYPE: PRT
51 <213> ORGANISM: Consensus Sequence
53 <400> SEQUENCE: 4
55 Leu Ile Val Met
56 1

58 <210> SEQ ID NO: 5
59 <211> LENGTH: 4
60 <212> TYPE: PRT
61 <213> ORGANISM:
63 <400> SEQUENCE: 5
65 Leu Ile Phe Ala
66 1

68 <210> SEQ ID NO: 6
69 <211> LENGTH: 7
70 <212> TYPE: PRT




71 <213> ORGANISM: Consensus Sequence
73 <400> SEQUENCE: 6
75 Leu Ile Val Met Phe Tyr Cys
76 1               5

78 <210> SEQ ID NO: 7
79 <211> LENGTH: 11
80 <212> TYPE: PRT
81 <213> ORGANISM: Consensus Sequence
83 <400> SEQUENCE: 7
85 Ser Ala Pro Gly Leu Val Phe Tyr Lys Gln His
86 1               5                   10

88 <210> SEQ ID NO: 8
89 <211> LENGTH: 6
90 <212> TYPE: PRT
91 <213> ORGANISM: Consensus Sequence
93 <400> SEQUENCE: 8
95 Asp Glu Asn Gln Met Trp
96 1               5

98 <210> SEQ ID NO: 9
99 <211> LENGTH: 12
100 <212> TYPE: PRT
101 <213> ORGANISM: Consensus Sequence
103 <400> SEQUENCE: 9
105 Lys Arg Gln Ala Ser Pro Cys Leu Ile Met Phe Trp
106 1               5                   10

108 <210> SEQ ID NO: 10
109 <211> LENGTH: 9
110 <212> TYPE: PRT
111 <213> ORGANISM: Consensus Sequence
113 <400> SEQUENCE: 10
115 Lys Arg Asn Gln Ser Thr Ala Val Met
116 1               5

118 <210> SEQ ID NO: 11
119 <211> LENGTH: 7
120 <212> TYPE: PRT
121 <213> ORGANISM: Consensus Sequence
123 <400> SEQUENCE: 11
125 Lys Arg Ala Cys Leu Val Met
126 1               5

128 <210> SEQ ID NO: 12
129 <211> LENGTH: 9
130 <212> TYPE: PRT
131 <213> ORGANISM: Consensus Sequence
133 <400> SEQUENCE: 12
135 Leu Ile Val Met Phe Tyr Pro Ala Asn
136 1               5

138 <210> SEQ ID NO: 13
139 <211> LENGTH: 6
140 <212> TYPE: PRT
141 <213> ORGANISM: Consensus Sequence
143 <400> SEQUENCE: 13
145 Leu Ile Val Met Phe Trp
146 1               5

148 <210> SEQ ID NO: 14
149 <211> LENGTH: 8
150 <212> TYPE: PRT
151 <213> ORGANISM: Consensus Sequence
153 <400> SEQUENCE: 14
155 Ser Ala Gly Cys Leu Ile Val Pro
156 1               5

158 <210> SEQ ID NO: 15
159 <211> LENGTH: 5
160 <212> TYPE: PRT
161 <213> ORGANISM: Consensus Sequence
163 <400> SEQUENCE: 15
165 Phe Tyr Trp His Pro
166 1               5

168 <210> SEQ ID NO: 16
169 <211> LENGTH: 4
170 <212> TYPE: PRT
171 <213> ORGANISM: Consensus Sequence
173 <400> SEQUENCE: 16
175 Lys Arg His Pro
176 1

178 <210> SEQ ID NO: 17
179 <211> LENGTH: 10
180 <212> TYPE: PRT
181 <213> ORGANISM: Consensus Sequence
183 <400> SEQUENCE: 17
185 Leu Ile Val Met Phe Tyr Trp Ser Thr Ala
186 1               5                   10

188 <210> SEQ ID NO: 18
189 <211> LENGTH: 6
190 <212> TYPE: PRT
191 <213> ORGANISM: Consensus Sequence
193 <400> SEQUENCE: 18
195 Leu Ile Val Met Phe Tyr
196 1               5

198 <210> SEQ ID NO: 19
199 <211> LENGTH: 4
200 <212> TYPE: PRT
201 <213> ORGANISM: Consensus Sequence
203 <400> SEQUENCE: 19
205 Gly Ser Thr Ala
206 1

208 <210> SEQ ID NO: 20
209 <211> LENGTH: 4
210 <212> TYPE: PRT
211 <213> ORGANISM: Consensus Sequence
213 <400> SEQUENCE: 20
215 Ser Thr Ala Gly
216 1

218 <210> SEQ ID NO: 21
219 <211> LENGTH:
220 <212> TYPE: PRT
221 <213> ORGANISM: Consensus Sequence
223 <400> SEQUENCE: 21
225 Ser Thr Glu Ile
226 1

228 <210> SEQ ID NO: 22
229 <211> LENGTH: 7
230 <212> TYPE: PRT
231 <213> ORGANISM: Consensus Sequence
233 <400> SEQUENCE: 22
235 Pro Ala Ser Leu
236 1

238 <210> SEQ ID NO: 23
239 <211> LENGTH: 4
240 <212> TYPE: PRT
241 <213> ORGANISM:
243 <400> SEQUENCE: 23
245 Ser Ala Lys Arg
246 1

248 <210> SEQ ID NO: 24
249 <211> LENGTH: 5
250 <212> TYPE: PRT
251 <213> ORGANISM: Consensus Sequence
253 <400> SEQUENCE: 24
255 Ser Thr Ala Gly Asn
256 1               5

258 <210> SEQ ID NO: 25
259 <211> LENGTH: 4
260 <212> TYPE: PRT
261 <213> ORGANISM: Consensus Sequence
263 <400> SEQUENCE: 25
265 Ser Ala Gly Val
266 1

268 <210> SEQ ID NO: 26
269 <211> LENGTH: 7
270 <212> TYPE: PRT
271 <213> ORGANISM: Consensus Sequence
273 <400> SEQUENCE: 26
275 Leu Ile Val Met Phe Tyr Trp
276 1               5

278 <210> SEQ ID NO: 27
279 <211> LENGTH: 7
280 <212> TYPE: PRT
281 <213> ORGANISM: Consensus Sequence
283 <400> SEQUENCE: 27
285 Gly Ser Ala Cys Ile Val Met
286 1               5

288 <210> SEQ ID NO: 28
289 <211> LENGTH: 4
290 <212> TYPE: PRT
291 <213> ORGANISM: Consensus Sequence
293 <400> SEQUENCE: 28
295 Gly Thr Ile Val
296 1

298 <210> SEQ ID NO: 29
299 <211> LENGTH: 6
300 <212> TYPE: PRT
301 <213> ORGANISM: Consensus Sequence
303 <400> SEQUENCE: 29
305 Gly Ser Thr Ala Asn Ile
306 1               5

308 <210> SEQ ID NO: 30
309 <211> LENGTH: 5
310 <212> TYPE: PRT
311 <213> ORGANISM: Consensus Sequence
313 <400> SEQUENCE: 30
315 Leu Ile Val Met Ala
316 1               5

318 <210> SEQ ID NO: 31
319 <211> LENGTH: 6
320 <212> TYPE: PRT
321 <213> ORGANISM: Consensus Sequence
323 <400> SEQUENCE: 31
325 Leu Ile Val Trp Pro Gln
326 1               5

328 <210> SEQ ID NO: 32
329 <211> LENGTH: 5
330 <212> TYPE: PRT
331 <213> ORGANISM: Consensus Sequence
333 <400> SEQUENCE: 32
335 Gly Ser Thr Ala Met
336 1               5

338 <210> SEQ ID NO: 33
339 <211> LENGTH: 6
340 <212> TYPE: PRT
341 <213> ORGANISM: Consensus Sequence
343 <400> SEQUENCE: 33
345 Leu Ile Met Pro Thr Ala
346 1               5

348 <210> SEQ ID NO: 34
349 <211> LENGTH: 4
350 <212> TYPE: PRT
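
Each record in a raw listing of this kind declares its length under <211> and then lists its residues after <400>, so the two can be compared mechanically. The sketch below assumes the printout layout used here (a leading line number and English headers such as "LENGTH:" after each numeric identifier) and only handles protein (PRT) records written with three-letter residue codes. The names `parse_records` and `length_mismatches` are hypothetical, and this is not a description of how STIC's own software works.

```python
import re

# A numeric identifier, an optional English header such as "LENGTH:", then the value.
FIELD = re.compile(r"<(\d{3})>\s*(?:[A-Z ]+:)?\s*(.*)")
# Three-letter amino acid codes such as Leu, Ile, Xaa.
RESIDUE = re.compile(r"\b[A-Z][a-z]{2}\b")

def parse_records(raw_listing):
    """Collect the fields of each sequence and count residues after <400>."""
    records, current = [], None
    for line in raw_listing.splitlines():
        field = FIELD.search(line)
        if field:
            tag, value = field.group(1), field.group(2).strip()
            if tag == "210":
                current = {"210": value, "residues": 0}
                records.append(current)
            elif current is not None:
                current[tag] = value
        elif current is not None and "400" in current:
            current["residues"] += len(RESIDUE.findall(line))
    return records

def length_mismatches(records):
    """Return (seq_id, declared, counted) for PRT records that disagree."""
    return [
        (r["210"], int(r["211"]), r["residues"])
        for r in records
        if r.get("212") == "PRT"
        and r.get("211", "").isdigit()
        and int(r["211"]) != r["residues"]
    ]
```

A mismatch reported on a scanned printout may reflect residues lost in scanning rather than an error in the submitted file, so any hit would need to be checked against the original computer readable form.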
VERIFICATION SUMMARY DATE: 05/31/2001

PATENT APPLICATION: US/09/595,326A TIME: 12:15:55

Input Set : C:\PTO.txt 

Output Set: C:\CRF3\05312001\I595326A.raw 